A Fast Filtration Algorithm for the Substring Matching Problem

Authors

  • Pavel A. Pevzner
  • Michael S. Waterman
Abstract

Given a text of length n and a query of length q we present an algorithm for finding all locations of m-tuples in the text and in the query that differ by at most k mismatches. This problem is motivated by the dot-matrix constructions for sequence comparison and optimal oligonucleotide probe selection routinely used in molecular biology. In the case q = m the problem coincides with the classical approximate string matching with k mismatches problem. We present a new approach to this problem based on multiple filtration which may have advantages over some sophisticated and theoretically efficient methods that have been proposed. This paper describes a two-stage process. The first stage (multiple filtration) uses a new technique to preselect roughly similar m-tuples. The second stage compares these m-tuples using an accurate method. We demonstrate the advantages of multiple filtration in comparison with other techniques for approximate pattern matching.
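
The two-stage idea (a cheap filter that preselects candidate m-tuple pairs, followed by exact verification) can be illustrated with a minimal Python sketch. This is not the authors' multiple filtration technique, which the abstract does not specify; the filter used here is the simpler, well-known pigeonhole (exact l-tuple) filter, swapped in purely for illustration. The function names `hamming` and `substring_matches` and the block length `l = m // (k + 1)` are assumptions of this sketch.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def substring_matches(text, query, m, k):
    """Return sorted (i, j) pairs such that text[i:i+m] and query[j:j+m]
    differ by at most k mismatches.

    Stage 1 uses a pigeonhole filter (an assumption of this sketch, not the
    paper's multiple filtration); stage 2 verifies by Hamming distance.
    """
    l = m // (k + 1)  # block length for the pigeonhole argument
    if l == 0:
        raise ValueError("pigeonhole filter needs m >= k + 1")

    # Index every l-tuple of the text by its start position.
    index = defaultdict(list)
    for s in range(len(text) - l + 1):
        index[text[s:s + l]].append(s)

    # Stage 1 (filtration): two m-tuples with at most k mismatches must agree
    # exactly on at least one of their first k + 1 aligned blocks of length l.
    candidates = set()
    for p in range(len(query) - l + 1):
        for s in index.get(query[p:p + l], ()):
            for b in range(k + 1):
                o = b * l  # offset of block b inside the m-tuple
                i, j = s - o, p - o
                if i >= 0 and j >= 0 and i + m <= len(text) and j + m <= len(query):
                    candidates.add((i, j))

    # Stage 2 (verification): compare each surviving pair exactly.
    return sorted(
        (i, j) for i, j in candidates
        if hamming(text[i:i + m], query[j:j + m]) <= k
    )
```

Calling `substring_matches(text, query, m, k)` with `len(query) == m` reduces to the classical k-mismatches problem mentioned in the abstract. The filter never discards a true match, so the result equals that of brute-force comparison of all pairs; how much work it saves depends on how selective exact l-tuple matches are for the given alphabet and parameters.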

Similar resources

Filtration Algorithms for Approximate Order-Preserving Matching

The exact order-preserving matching problem is to find all the substrings of a text T which have the same length and relative order as a pattern P. Like string matching, order-preserving matching can be generalized by allowing the match to be approximate. In approximate order-preserving matching two strings match if they have the same relative order after removing up to k elements in the same p...

Order-Preserving Matching with Filtration

The problem of order-preserving matching has gained attention lately. The text and the pattern consist of numbers. The task is to find all substrings in the text which have the same relative order as the pattern. The problem has applications in analysis of time series like stock market or weather data. Solutions based on the KMP and BMH algorithms have been presented earlier. We present a new s...

A New Filtration Method Based on the Locality Property for Approximate String Matching

In this paper, we consider the approximate string matching problem. We give a method to eliminate candidate locations in text T as there can be no substring S ending at those locations such that the edit distance between S and pattern P is smaller than or equal to a specified error bound. Our method is simple to implement. Experimental results show that our method is effective. For instance, f...

A New Algorithm for Fast All-Against-All Substring Matching

We present a new and efficient algorithm to solve the 'threshold all vs. all' problem, which involves searching two strings (of lengths N and M, respectively) to find all maximal approximate matches of length at least S with up to K differences. The algorithm is based on a novel graph model, and it solves the problem in time O(NMK).

A Fast Order-Preserving Matching with q-neighborhood Filtration Using SIMD Instructions

The order-preserving matching problem is a variant of the pattern matching problem focusing on shapes of sequences instead of values of sequences. Given a text and a pattern, the problem is to output all positions where the pattern and a subsequence in the text are of the same relative order. Chhabra and Tarhio proposed a fast algorithm based on filtration for the order-preserving matching prob...

Publication date: 1993